 self-attentive network


Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation

arXiv.org Artificial Intelligence

One of the leading single-channel speech separation (SS) models is based on a TasNet with a dual-path segmentation technique, where the size of each segment remains unchanged throughout all layers. In contrast, our key finding is that multi-granularity features are essential for enhancing contextual modeling and computational efficiency. We introduce a self-attentive network with a novel sandglass shape, namely Sandglasset, which advances the state-of-the-art (SOTA) SS performance at significantly smaller model size and computational cost. Moving forward through the blocks of Sandglasset, the temporal granularity of the features gradually becomes coarser until halfway through the network, and then successively becomes finer towards the raw signal level. We also find that residual connections between features with the same granularity are critical for preserving information after passing through the bottleneck layer. Experiments show that Sandglasset, with only 2.3M parameters, achieves the best results on two benchmark SS datasets, WSJ0-2mix and WSJ0-3mix, improving the SI-SNRi scores by an absolute 0.8 dB and 2.4 dB, respectively, compared to the prior SOTA results.
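The coarser-then-finer pattern described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names `sandglass_schedule` and `residual_pairs`, the block count of 6, and the per-step factor of 4 are all assumptions chosen for illustration. The sketch only shows how a downsampling factor might grow until the middle of the network and shrink afterwards, and which blocks share a granularity and could therefore be linked by residual connections.

```python
def sandglass_schedule(num_blocks, base=4):
    """Return a hypothetical downsampling factor for each block.

    The factor grows by `base` per block until the middle of the network,
    then shrinks symmetrically. `base` is an assumed value, not one taken
    from the abstract.
    """
    half = num_blocks // 2
    factors = []
    for b in range(num_blocks):
        # Distance from the nearest end of the network sets the coarseness.
        depth = b if b < half else num_blocks - 1 - b
        factors.append(base ** depth)
    return factors


def residual_pairs(factors):
    """Pair each early block with the later block of equal granularity."""
    n = len(factors)
    return [(i, n - 1 - i) for i in range(n // 2) if factors[i] == factors[n - 1 - i]]


schedule = sandglass_schedule(6)
pairs = residual_pairs(schedule)
print(schedule)  # [1, 4, 16, 16, 4, 1]
print(pairs)     # [(0, 5), (1, 4), (2, 3)]
```

Under these assumptions, the two middle blocks operate on the coarsest features, and each pair in `pairs` marks two blocks whose features have matching temporal resolution.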


SANST: A Self-Attentive Network for Next Point-of-Interest Recommendation

arXiv.org Machine Learning

Next point-of-interest (POI) recommendation aims to offer suggestions on which POI to visit next, given a user's POI visit history. This problem has wide application in the tourism industry, and it is gaining increasing interest as more POI check-in data become available. The problem is often modeled as a sequential recommendation problem to take advantage of the sequential patterns of user check-ins, e.g., people tend to visit Central Park after The Metropolitan Museum of Art in New York City. Recently, self-attentive networks have been shown to be both effective and efficient in general sequential recommendation problems, e.g., to recommend products, video games, or movies. Directly adopting self-attentive networks for next POI recommendation, however, may produce sub-optimal recommendations. This is because vanilla self-attentive networks do not consider the spatial and temporal patterns of user check-ins, which are two critical features in next POI recommendation. To address this limitation, in this paper, we propose a model named SANST that incorporates spatio-temporal patterns of user check-ins into self-attentive networks. To incorporate the spatial patterns, we encode the relative positions of POIs into their embeddings before feeding the embeddings into the self-attentive network. To incorporate the temporal patterns, we discretize the time of POI check-ins and model the temporal relationship between POI check-ins by a relation-aware self-attention module. We evaluate the performance of our SANST model with three real-world datasets. The results show that SANST consistently outperforms the state-of-the-art models, and the advantage in nDCG@10 is up to 13.65%.
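Two ideas from this abstract lend themselves to a short sketch: discretizing check-in times into pairwise relation IDs, and scoring a ranked list with nDCG@10. The abstract does not specify the bucketing scheme or the evaluation protocol, so everything below is an assumption: the helper names `time_relation_matrix` and `ndcg_at_k`, the one-day bucket width, the clipping value of 10, and the setup with a single ground-truth next POI per test case.

```python
import math


def time_relation_matrix(timestamps, bucket_seconds=86400, max_bucket=10):
    """Discretize pairwise time gaps between check-ins into integer buckets.

    Entry [i][j] is the gap between check-ins i and j in whole buckets
    (days by default), clipped at `max_bucket`. A relation-aware attention
    layer could look up one embedding per bucket ID. Bucket width and the
    clipping value are assumed, not taken from the abstract.
    """
    n = len(timestamps)
    return [
        [min(abs(timestamps[i] - timestamps[j]) // bucket_seconds, max_bucket)
         for j in range(n)]
        for i in range(n)
    ]


def ndcg_at_k(ranked_pois, target, k=10):
    """nDCG@k when exactly one POI (the true next visit) is relevant.

    With a single relevant item the ideal DCG is 1, so the score reduces
    to 1 / log2(rank + 1) if the target is in the top k, else 0.
    """
    top = ranked_pois[:k]
    if target not in top:
        return 0.0
    rank = top.index(target) + 1  # 1-based position of the target
    return 1.0 / math.log2(rank + 1)


# Three check-ins: day 0, day 1, and day 30 (in seconds).
relations = time_relation_matrix([0, 86400, 86400 * 30])
print(relations)                              # [[0, 1, 10], [1, 0, 10], [10, 10, 0]]
print(ndcg_at_k(["a", "b", "c"], target="c"))  # 0.5
```

Clipping the bucket ID keeps the number of relation embeddings bounded, at the cost of treating all gaps beyond the cap as equivalent.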